Assessing Spatial Stationarity and Segmenting Spatial Processes
into Stationary Components
ShengLi Tzeng^a, Bo-Yu Chen^b, Hsin-Cheng Huang^{c,∗}
^a Department of Applied Mathematics, National Sun Yat-Sen University, Taiwan.
^b Department of Statistics, Purdue University, USA.
^c Institute of Statistical Science, Academia Sinica, Taiwan.
Abstract
In this research, we propose a novel technique for visualizing nonstationarity in geostatistics,
particularly when confronted with a single realization of data at irregularly spaced locations.
Our method hinges on formulating a statistic that tracks a stable microergodic parameter
of the exponential covariance function, allowing us to address the intricate challenges of
nonstationary processes that lack repeated measurements. We implement the fused lasso
technique to elucidate nonstationary patterns at various resolutions. For prediction pur-
poses, we segment the spatial domain into stationary sub-regions via Voronoi tessellations.
Additionally, we devise a robust test for stationarity based on contrasting the sample means
of our proposed statistics between two selected Voronoi subregions. The effectiveness of our
method is demonstrated through simulation studies and its application to a precipitation
dataset in Colorado.
Keywords:
Fused lasso, Geostatistics, irregularly spaced data, microergodic parameter,
nonstationary spatial process, spatial clustering, spatial visualization, stationarity test,
Voronoi tessellation
∗Corresponding author
Email address: hchuang@stat.sinica.edu.tw (Hsin-Cheng Huang)
arXiv:2210.08231v2  [stat.ME]  28 Aug 2023

1. Introduction
Consider a spatial process {y(s) : s ∈ D} of interest defined on a region D ⊂ R². Suppose
that we observe data z ≡ (z(s1), . . . , z(sn))′ at n spatial locations, which may be irregularly
spaced, according to the measurement equation:
\[ z(s_i) = y(s_i) + e(s_i); \quad i = 1, \dots, n, \tag{1} \]
where e(s1), . . . , e(sn) ∼ N(0, τ²) are white-noise variables, representing measurement errors.
A major problem in geostatistics, called kriging, is to predict y(s0) at any location s0 ∈D
based on z. For simplicity, we assume that the mean function of the process y(·) is known
and, without loss of generality, zero.
Then for a given covariance function of y(·), the
ordinary-kriging predictor of y(s0) is
\[ \hat{y}(s_0) = \left\{ c + \frac{1 - c'\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}\,\mathbf{1} \right\}' \Sigma^{-1} z, \tag{2} \]
where c ≡ cov(z, y(s0)), Σ ≡ var(z), and 1 = (1, . . . , 1)′.
Given a single realization of noisy data z at n locations, it is typical to assume that the covariance
function of y(·) is stationary. A commonly used stationary covariance model is the isotropic
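Matérn family (Matérn, 1986) given by
\[ \mathrm{cov}(y(s), y(s+u)) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)} \left( \frac{\sqrt{2\nu}}{\alpha}\,\|u\| \right)^{\nu} K_\nu\!\left( \frac{\sqrt{2\nu}}{\alpha}\,\|u\| \right); \quad s,\, s+u \in \mathbb{R}^2, \tag{3} \]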
where Kν(·) is the modified Bessel function of the second kind of order ν, σ2 is a variance
parameter, and θ ≡(α, ν)′ consists of a scale parameter α and a smoothness parameter ν.
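As a quick illustration of the parametrization in (3), the following is a minimal sketch, not the authors' code, that evaluates the Matérn covariance for the half-integer smoothness values ν ∈ {1/2, 3/2, 5/2}, where the Bessel function reduces to a well-known closed form. The function name `matern_cov` and the restriction to these three values are assumptions made for illustration; for ν = 1/2 the model reduces to the exponential covariance σ² exp(−∥u∥/α) used later in the paper.

```python
import math


def matern_cov(h: float, sigma2: float, alpha: float, nu: float) -> float:
    """Matern covariance of (3) at lag h >= 0, for nu in {0.5, 1.5, 2.5} only."""
    x = math.sqrt(2.0 * nu) * h / alpha  # scaled lag sqrt(2 nu) * h / alpha
    if nu == 0.5:
        poly = 1.0                       # exponential model
    elif nu == 1.5:
        poly = 1.0 + x
    elif nu == 2.5:
        poly = 1.0 + x + x * x / 3.0
    else:
        raise ValueError("closed form implemented only for nu in {0.5, 1.5, 2.5}")
    return sigma2 * poly * math.exp(-x)


if __name__ == "__main__":
    # nu = 1/2 gives sigma2 * exp(-h / alpha)
    print(matern_cov(0.3, 2.0, 0.5, 0.5), 2.0 * math.exp(-0.6))
```

A general ν would require the modified Bessel function K_ν, which is not in the Python standard library.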
Spatial covariance functions are not always stationary. Local conditions and topographical
variation can induce substantial departures from stationarity, and visualizing nonstationary
features from a single dataset is challenging. For illustration, Figure 1(a1) depicts a
zero-mean stationary process with a Matérn covariance function. In contrast, Figure 1(a2)
presents a zero-mean piecewise stationary process characterized by two distinct Matérn
covariance functions. While the nonstationarity in Figure 1(a2) is apparent, determining which
of Figures 1(b1) and 1(b2) (each consisting of 400 random samples from the processes
in Figures 1(a1) and 1(a2), respectively) exhibits nonstationarity is not straightforward.
Several approaches have been proposed for testing spatial stationarity. Fuentes (2005)
pioneered a frequency domain test for spatial samples on a regular grid. Jun and Genton
(2012) proposed a test that partitions the spatial domain into two non-intersecting fields
for irregularly spaced data. More recently, Bandyopadhyay and Rao (2017) unveiled a test
leveraging the Fourier transform in the frequency domain, catering to irregularly spaced
data. In addition, local indicators of spatial autocorrelation (LISA) have been proposed by
Anselin (1995) for lattice data. To our knowledge, there seems to be an absence of spatial
dependence indices crafted explicitly for irregularly sampled data within geostatistics.
In this study, we introduce a local statistic designed to highlight nonstationary character-
istics within geostatistical datasets. This is achieved by employing a robust local estimation
of a microergodic parameter inherent to the exponential covariance model. We leverage the
fused lasso methodology to illuminate nonstationary patterns across varying resolutions. To
enhance the accuracy of spatial predictions using stationary models, we segment the spa-
tial domain into homogenous sub-regions utilizing Voronoi tessellations. A rigorous test for
spatial stationarity is established by comparing the sample means of the estimated microer-
godic parameters between two Voronoi subregions. If the stationarity is violated, we further
partition D into K components {D1, . . . , DK} such that each process {y(s) : s ∈Dk} is
stationary, for k = 1, . . . , K. Both Guinness and Fuentes (2015)
and Muyskens et al. (2022) have developed techniques to partition the domain D into stationary
subregions. However, the approach of Guinness and Fuentes (2015) requires data on a regular grid,
while the method of Muyskens et al. (2022) requires a regular grid for the base
partition because the algorithm relies on circulant embedding. Moreover, the choice of grid
resolution influences the segmentation, and selecting among different resolutions increases
the computational demands.
The rest of this paper is organized as follows. In Section 2, we develop a statistic to
Figure 1: (a1) A zero-mean stationary process; (a2) A zero-mean nonstationary process; (b1) Data sampled
from the process in (a1) at 400 locations using simple random sampling; (b2) Data sampled from the process
in (a2) at 400 locations using simple random sampling; (c1) Local spatial indices based on the data in (b1);
(c2) Local spatial indices based on the data in (b2).

monitor spatial heterogeneity at multiple resolutions. The statistic is designed to estimate
a microergodic parameter of the exponential variogram with data that may be irregularly
spaced.
We then provide an approach to partition the domain D into K homogeneous
subregions using Voronoi tessellations (Voronoi, 1908). Section 3 gives the proposed test for
stationarity using a t-type statistic based on the Voronoi subregions obtained from Section 2
with K = 2. Some simulation results are given in Section 4. An application to a precipitation
dataset in Colorado is provided in Section 5. Finally, Section 6 concludes with a summary.
2. Segmenting spatial processes into stationary components
2.1. A statistic for local spatial dependence
First, we construct a statistic to monitor the local spatial dependence of y(·) around si,
for i = 1, . . . , n. To find local structure around si, we consider a neighborhood set of si:
\[ N_i \equiv \{ j : \|s_j - s_i\| \le r,\ j \ne i \}; \quad i = 1, \dots, n, \tag{4} \]
where ∥·∥ is the Euclidean norm and r > 0 is an appropriate radius. If y(·) is an isotropic
stationary process around si, then its variogram at distance h is
\[ 2\gamma_{y,i}(h) \equiv E\{y(s) - y(s_i)\}^2, \quad \text{for } h = \|s - s_i\| \le r; \quad i = 1, \dots, n. \]
It follows that for j ∈ Ni,
\[ 2\gamma_{z,i}(\|s_j - s_i\|) \equiv E\{(z(s_j) - z(s_i))^2\} = 2\gamma_{y,i}(\|s_j - s_i\|) + 2(1 - \delta_{ij})\tau^2; \quad i = 1, \dots, n, \]
where δij ≡ 1 if i = j; 0 otherwise, so the nugget term equals 2τ² for every j ∈ Ni. To reflect
the local behavior, it is desirable to consider Ni's with a small r. To obtain a statistic that
is robust to outliers, we utilize a square-root
transform and apply the following approximation formula (Cressie and Hawkins, 1980):
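\[ \frac{|z(s_j) - z(s_i)|^{1/2}}{\{2\gamma_{z,i}(\|s_j - s_i\|)\}^{1/4}} \;\overset{\text{approx.}}{\sim}\; N\!\left( 2^{1/4}\pi^{-1/2}\Gamma(3/4),\ 2^{1/2}\{\pi^{-1/2} - \pi^{-1}\Gamma(3/4)^2\} \right); \quad j \in N_i. \tag{5} \]
We first assume that τ² is known, and consider a local exponential semi-variogram:
\[ \gamma_{y,i}(h) = \sigma_i^2 \{1 - \exp(-h/\alpha_i)\}; \quad h \ge 0,\ i = 1, \dots, n, \]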
parametrized by variance σ_i² and range parameter α_i; i = 1, . . . , n. However, it is well known
that neither σ_i² nor α_i can be consistently estimated under the infill asymptotic framework
(Zhang, 2004). Instead, we focus on their ratio, σ_i²/α_i, a microergodic parameter that can be
consistently estimated. Applying a Taylor expansion to γ_{z,i}(h)^{1/4} at h = 0, we obtain
\[ \gamma_{z,i}(h)^{1/4} = \left\{ \sigma_i^2 (1 - \exp(-h/\alpha_i)) + \tau^2 \right\}^{1/4} = \tau^{1/2} + \tau^{-3/2}\sigma_i^2 h/(4\alpha_i) + O(h^2); \quad i = 1, \dots, n. \]
Substituting τ^{1/2} + τ^{−3/2} σ_i² ∥sj − si∥/(4α_i) above for γ_{z,i}(∥sj − si∥)^{1/4} in (5) leads to
\[ E\left\{ \frac{|z(s_i) - z(s_j)|^{1/2} - C_1}{C_2 \|s_i - s_j\|} \right\} \approx \sigma_i^2/\alpha_i; \quad i = 1, \dots, n, \tag{6} \]
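where C1 ≡ 2^{1/2} π^{−1/2} Γ(3/4) (τ²)^{1/4} and C2 ≡ 2^{−3/2} π^{−1/2} Γ(3/4) (τ²)^{−3/4}. Note that the
left-hand side of (6) depends on ∥sj − si∥, but the right-hand side does not. In addition, from
(5), for small ∥sj − si∥,
\[ \mathrm{var}\left\{ \frac{|z(s_i) - z(s_j)|^{1/2} - C_1}{C_2 \|s_i - s_j\|} \right\} \approx \frac{C_3}{\|s_j - s_i\|^2}, \]
where C3 ≡ 2{π^{−1/2} − π^{−1}Γ(3/4)²}/C2. Because this variance is inversely proportional to
∥sj − si∥², inverse-variance weighting suggests weights proportional to ∥sj − si∥². This motivates
us to use the following weighted
average as our local spatial indices to monitor the heterogeneity of spatial dependence: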
\[ \xi_i \equiv \frac{1}{|N_i| \sum_{j \in N_i} \omega_{ij}} \sum_{j \in N_i} \left\{ \omega_{ij}\, \frac{|z(s_j) - z(s_i)|^{1/2} - C_1}{\|s_j - s_i\|} \right\}; \quad i \in I, \tag{7} \]
where ωij ≡ ∥sj − si∥², I ≡ {i : |Ni| > 0, i = 1, . . . , n}, and |Ni| denotes the number of
elements in Ni. In practice, we recommend choosing r = {5|D|/(nπ)}^{1/2} in (4), where |D| denotes
the area of D, so that |Ni| ≈ 5 on average, for i ∈ I. Figures 1(c1) and 1(c2) show the proposed
local spatial indices of (7) based on the data in Figures 1(b1) and 1(b2), respectively.
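The construction of the local indices can be sketched in a few lines. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the brute-force neighbor search, and the decision to skip coincident locations (zero distance) are all choices made here. By default it returns the plain weighted average with weights ωij = ∥sj − si∥², which matches the prose description and the later statement E(ξi) ≈ C2σ_i²/α_i; the flag `divide_by_count` additionally applies the factor 1/|Ni| that appears in the printed display (7).

```python
import math


def neighbor_radius(area: float, n: int, target: float = 5.0) -> float:
    """r = {target * |D| / (n * pi)}^(1/2): a disc of radius r holds about `target` points."""
    return math.sqrt(target * area / (n * math.pi))


def local_indices(coords, z, r, tau2, divide_by_count=False):
    """Return {i: xi_i} for every location whose neighborhood N_i is nonempty."""
    # C1 = 2^(1/2) * pi^(-1/2) * Gamma(3/4) * (tau^2)^(1/4)
    c1 = math.sqrt(2.0 / math.pi) * math.gamma(0.75) * tau2 ** 0.25
    xi = {}
    for i, (si, zi) in enumerate(zip(coords, z)):
        num = den = 0.0
        count = 0
        for j, (sj, zj) in enumerate(zip(coords, z)):
            if j == i:
                continue
            h = math.dist(si, sj)
            if h == 0.0 or h > r:  # assumption: coincident points are skipped
                continue
            w = h * h              # omega_ij = ||s_j - s_i||^2
            num += w * (math.sqrt(abs(zj - zi)) - c1) / h
            den += w
            count += 1
        if count > 0:
            xi[i] = num / den / (count if divide_by_count else 1)
    return xi


if __name__ == "__main__":
    coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    print(local_indices(coords, [0.0, 1.0, 5.0], r=1.5, tau2=0.0))
```

For n in the thousands, a spatial index (for example a k-d tree) would replace the quadratic neighbor search.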
When τ² is unknown, we estimate it based on a linear extrapolation of γ_{z,i}(·) to lag zero
from two small lags, determined by P_k ≡ {(i, j) : d*_{k−1} < ∥si − sj∥ ≤ d*_k, i < j}; k = 1, 2,
where 0 = d*_0 < d*_1 < d*_2. Specifically, we compute the robust semivariogram estimates of
Cressie and Hawkins (1980) based on pairs in P_k:
\[ \hat{\gamma}_k = \frac{\left\{ \sum_{(i,j) \in P_k} |z(s_i) - z(s_j)|^{1/2} / m_k \right\}^4}{2(0.457 + 0.494/m_k + 0.045/m_k^2)}; \quad k = 1, 2, \tag{8} \]
where m_k is the number of pairs in P_k; k = 1, 2.
Applying linear extrapolation while
imposing constraints for a nonnegative slope and intercept, we obtain
\[ \hat{\tau}^2 = \max\left\{ 0,\ \hat{\gamma}_1 - d_1 \max\left( 0,\ \frac{\hat{\gamma}_2 - \hat{\gamma}_1}{d_2 - d_1} \right) \right\}, \tag{9} \]
where d_k = Σ_{(i,j)∈P_k} ∥si − sj∥/m_k is the average distance among pairs in P_k, for k = 1, 2.
Two distinct subregions are discernible from Figure 2. In contrast, there is no clear pattern
from Figure 3.
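The nugget estimate in (8) and (9) amounts to two robust semivariogram values and one constrained linear extrapolation. A minimal sketch follows; the function names are hypothetical, and the inputs are assumed to be the differences z(si) − z(sj) for the pairs already assigned to each lag bin.

```python
import math


def robust_semivariogram(diffs):
    """Estimate (8) from the differences z(s_i) - z(s_j) of the pairs in one lag bin."""
    m = len(diffs)
    mean_root = sum(math.sqrt(abs(d)) for d in diffs) / m
    return mean_root ** 4 / (2.0 * (0.457 + 0.494 / m + 0.045 / m ** 2))


def nugget_estimate(gamma1, gamma2, d1, d2):
    """Estimate (9): extrapolate to lag zero with nonnegative slope and intercept."""
    slope = max(0.0, (gamma2 - gamma1) / (d2 - d1))
    return max(0.0, gamma1 - d1 * slope)


if __name__ == "__main__":
    g1 = robust_semivariogram([0.2, -0.1, 0.3, -0.25])
    g2 = robust_semivariogram([0.5, -0.4, 0.6, -0.3])
    print(nugget_estimate(g1, g2, d1=0.05, d2=0.10))
```

The two `max` calls encode the constraints: a decreasing empirical semivariogram yields slope zero (so the estimate is γ̂1), and a negative intercept is truncated at zero.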
2.2. Multiresolution spatial visualization
We note from (5) that ξi is approximately Gaussian with E(ξi) ≈ C2 σ_i²/α_i, for i ∈ I.
Let n* ≡ |I|, and without loss of generality, assume that I = {1, . . . , n*}. If y(·) is globally
stationary, we have σ_1² = · · · = σ_{n*}² and α_1 = · · · = α_{n*}, and hence E(ξ_1) ≈ · · · ≈ E(ξ_{n*}). We
can apply a spatial-clustering approach to segment D into stationary components based on
ξ ≡ (ξ_1, . . . , ξ_{n*})′.
To explore the spatial nonstationarity evident in the data presented in Figure 1(b2) at
various resolutions, we decompose D into disjoint Voronoi cells, denoted as {B_1, . . . , B_{n*}},
corresponding to {s_1, . . . , s_{n*}}.
Each point in the Voronoi cell B_k is closer to s_k than to
any other point in the set {s_1, . . . , s_{n*}}. Then we cluster {B_1, . . . , B_{n*}} into homogeneous
components using the fused lasso (Tibshirani et al., 2005), which minimizes
\[ \sum_{i=1}^{n^*} (\xi_i - \beta_i)^2 + \rho \sum_{(j,k) \in E} |\beta_j - \beta_k| \]
over β ≡ (β_1, . . . , β_{n*})′, where E is obtained by linking any two cells that share a boundary
and ρ ≥ 0 is a regularization parameter. The resulting images at multiple resolutions with different
tuning parameter values of ρ are shown in Figure 2.
The corresponding images for the
data in Figure 1(b1) from a stationary process are shown in Figure 3. Comparing the two
collections of images, it is evident that Figure 2 showcases at least two major components
with distinct values across different resolutions. In contrast, Figure 3 consistently exhibits a
single prominent homogeneous component throughout all resolutions.
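To make the role of ρ concrete, the sketch below only evaluates the fused lasso criterion for a candidate β on a given adjacency graph; it is not a solver, and the function name and the toy four-cell chain are assumptions for illustration. It shows how a larger ρ makes a single flat component cheaper than a two-level fit, which is the mechanism behind the coarser images at higher regularization.

```python
def fused_lasso_objective(xi, beta, edges, rho):
    """Sum of squared residuals plus rho times the total variation of beta over the graph."""
    fit = sum((x - b) ** 2 for x, b in zip(xi, beta))
    penalty = sum(abs(beta[j] - beta[k]) for j, k in edges)
    return fit + rho * penalty


if __name__ == "__main__":
    xi = [1.0, 1.0, 5.0, 5.0]            # toy local indices on four cells
    edges = [(0, 1), (1, 2), (2, 3)]     # cells that share a boundary (a chain here)
    two_level = [1.0, 1.0, 5.0, 5.0]     # keeps both components
    flat = [3.0, 3.0, 3.0, 3.0]          # merges everything
    for rho in (1.0, 5.0):
        print(rho,
              fused_lasso_objective(xi, two_level, edges, rho),
              fused_lasso_objective(xi, flat, edges, rho))
```

In this toy case the two-level fit costs 4ρ and the flat fit costs 16, so the flat fit becomes preferable once ρ exceeds 4.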

Figure 2: Multiresolution spatial visualization for a piecewise stationary process using the proposed fused
lasso method.
Figure 3: Multiresolution spatial visualization for a stationary process using the proposed fused lasso method.

2.3. Voronoi tessellations into stationary components
The fused lasso approach given in the previous section is used for visualization.
For subsequent spatial prediction, we
aim to partition the region D into stationary components.
To achieve this, we utilize the Voronoi subregions constructed from K seeds denoted as
{p_1, . . . , p_K} ⊂ R². Given these seeds, we derive the corresponding Voronoi tessellation,
subdividing D into K distinct components D_1, . . . , D_K.
Define n_k ≡ Σ_{i∈I} I(si ∈ D_k);
k = 1, . . . , K.
Let S_K be the set of all possible sets of K seeds.
To identify the optimal set
of K seeds {p_1, . . . , p_K} from S_K, we apply the following objective function grounded in
the independent normal likelihood:
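\[ f_K(p_1, \dots, p_K; \xi) = \sum_{k=1}^{K} \sum_{i \in I:\, s_i \in D_k} \left\{ \log(\hat{v}_k(D_k)) - \log \phi\!\left( \frac{\xi_i - \bar{\mu}_k(D_k)}{\hat{v}_k(D_k)} \right) \right\}, \tag{10} \]
where
\[ \bar{\mu}_k(D_k) \equiv \frac{1}{n_k} \sum_{i \in I:\, s_i \in D_k} \xi_i, \quad \text{and} \quad \hat{v}_k^2(D_k) \equiv \frac{1}{n_k} \sum_{i \in I:\, s_i \in D_k} (\xi_i - \bar{\mu}_k(D_k))^2, \]
are the maximum likelihood (ML) estimators of the mean and the variance of {ξi : si ∈
D_k, i ∈ I}, for k = 1, . . . , K, and ϕ(·) is the probability density function of the standard
normal distribution. The proposed segmentation of D into K ≥ 2 components is determined
by
\[ \left\{ \hat{p}_1^{(K)}, \dots, \hat{p}_K^{(K)} \right\} \equiv \mathop{\arg\min}_{\{p_1, \dots, p_K\} \in S_K} f_K(p_1, \dots, p_K; \xi) \tag{11} \]
with the corresponding Voronoi tessellation \hat{D}_1^{(K)}, . . . , \hat{D}_K^{(K)}.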
We propose a simple algorithm to find the solution of (11). Its pseudo-code is outlined
in Algorithm 1.
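The seed search can be sketched as a coordinate-wise update over candidate seeds drawn from the data locations. The code below is an illustrative sketch in the spirit of Algorithm 1, not the authors' implementation. Several choices are assumptions made here: the initial seeds are passed in by the caller rather than obtained from the deterministic K-means initialization, cells with fewer than two points or zero variance are assigned an infinite objective, and ties in the nearest-seed assignment go to the lower seed index. The objective uses the fact that, with the ML plug-ins in (10), each cell contributes (n_k/2){log v̂_k² + 1 + log(2π)}.

```python
import math


def assign(coords, seeds):
    """Label each location with the index of its nearest seed (its Voronoi cell)."""
    return [min(range(len(seeds)), key=lambda k: math.dist(s, seeds[k])) for s in coords]


def objective(xi, labels, K):
    """f_K of (10); +inf when a cell has fewer than two points or zero variance."""
    total = 0.0
    for k in range(K):
        vals = [x for x, lab in zip(xi, labels) if lab == k]
        n_k = len(vals)
        if n_k < 2:
            return math.inf
        mean = sum(vals) / n_k
        var = sum((x - mean) ** 2 for x in vals) / n_k
        if var <= 0.0:
            return math.inf
        total += 0.5 * n_k * (math.log(var) + 1.0 + math.log(2.0 * math.pi))
    return total


def segment(coords, xi, seeds):
    """Replace one seed at a time by the best location in its own cell until no gain."""
    seeds = list(seeds)
    K = len(seeds)
    best = objective(xi, assign(coords, seeds), K)
    improved = True
    while improved:
        improved = False
        for k in range(K):
            labels = assign(coords, seeds)
            candidates = [s for s, lab in zip(coords, labels) if lab == k]
            for cand in candidates:
                trial = seeds[:k] + [cand] + seeds[k + 1:]
                value = objective(xi, assign(coords, trial), K)
                if value < best:  # accept strict improvements only
                    best, seeds, improved = value, trial, True
    return seeds, assign(coords, seeds), best


if __name__ == "__main__":
    coords = [(float(x), 0.0) for x in range(10)]
    xi = [0.0, 0.1, 0.0, 0.1, 0.0, 5.0, 5.1, 5.0, 5.1, 5.0]
    seeds, labels, value = segment(coords, xi, [coords[0], coords[1]])
    print(seeds, labels, value)
```

Because only strict improvements are accepted and the candidate set is finite, the loop terminates; like any coordinate-wise search, it may stop at a local rather than global minimizer of (11).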
3. The proposed test for stationarity
We can utilize the proposed segmentation with K = 2 to test for spatial stationarity. We
consider the following hypothesis test:
H0 : y(·) is stationary versus H1 : y(·) is not stationary.

Algorithm 1 Find the solution of (11) with a given K based on ξ.
Require:
    {p_1, . . . , p_K}: the initial seeds obtained from the deterministic K-means algorithm of
        Nidheesh et al. (2017);
    {D_1, . . . , D_K}: the Voronoi tessellation corresponding to {p_1, . . . , p_K}.
Ensure:
    the updated seeds {p_1, . . . , p_K} and the corresponding tessellation {D_1, . . . , D_K}.
repeat
    for k ← 1 to K do
        update p_k by replacing it with the location in {si ∈ D_k : i = 1, . . . , n*} that
            minimizes f_K(p_1, . . . , p_K; ξ);
        update {D_1, . . . , D_K} corresponding to the current seeds {p_1, . . . , p_K};
    end for
until no further reduction of f_K(p_1, . . . , p_K; ξ) is possible.
Based on the two subregions \hat{D}_1^{(2)} and \hat{D}_2^{(2)} selected by (11) with K = 2, we propose the
following two-sample t statistic:
\[ T = \frac{\bar{\mu}_1(\hat{D}_1^{(2)}) - \bar{\mu}_2(\hat{D}_2^{(2)})}{\sqrt{\hat{v}_1^2(\hat{D}_1^{(2)})/(\hat{n}_1 - 1) + \hat{v}_2^2(\hat{D}_2^{(2)})/(\hat{n}_2 - 1)}}, \tag{12} \]
where \hat{n}_k ≡ Σ_{i=1}^{n*} I(si ∈ \hat{D}_k^{(2)}); k = 1, 2. The distribution of T is complicated because there
is a selection process involved in obtaining \hat{D}_1^{(2)} and \hat{D}_2^{(2)}. So, we apply a Monte Carlo (MC)
method to find the null distribution of T. Specifically, we assume that under H0, y(·) is a
Gaussian process with the isotropic Matérn covariance model of (3). We estimate θ and σ²
in (3) by ML. The ML estimator of θ = (α, ν)′ can be obtained by minimizing the negative
log profile likelihood:
\[ \hat{\theta} \equiv (\hat{\alpha}, \hat{\nu})' \equiv \mathop{\arg\min}_{\theta} \left\{ \frac{1}{2} \log |\Omega(\theta)| + \frac{n}{2} \log\left\{ z'\Omega(\theta)^{-1} z \right\} + \text{constant} \right\}, \tag{13} \]
where Ω(θ) is an n × n correlation matrix whose (i, j)-th entry is {cov(y(si), y(sj)) +
τ²δij}/σ². Then the ML estimator of σ² is given by:
\[ \hat{\sigma}^2 \equiv \frac{1}{n} z'\Omega(\hat{\theta})^{-1} z. \tag{14} \]
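For reference, the profile form can be recovered from the Gaussian likelihood of z with covariance σ²Ω(θ). The derivation below is a sketch in the notation above, under the assumption that Ω(θ) is treated as free of σ² when profiling:

```latex
\begin{align*}
-\log L(\theta, \sigma^2)
  &= \tfrac{1}{2} \log\left|\sigma^2 \Omega(\theta)\right|
     + \tfrac{1}{2\sigma^2}\, z'\Omega(\theta)^{-1} z + \text{constant} \\
  &= \tfrac{n}{2} \log \sigma^2 + \tfrac{1}{2} \log|\Omega(\theta)|
     + \tfrac{1}{2\sigma^2}\, z'\Omega(\theta)^{-1} z + \text{constant}, \\
\frac{\partial}{\partial \sigma^2}\left\{-\log L(\theta, \sigma^2)\right\} = 0
  &\;\Longrightarrow\;
  \hat{\sigma}^2(\theta) = \tfrac{1}{n}\, z'\Omega(\theta)^{-1} z, \\
-\log L\left(\theta, \hat{\sigma}^2(\theta)\right)
  &= \tfrac{1}{2} \log|\Omega(\theta)|
     + \tfrac{n}{2} \log\left\{ z'\Omega(\theta)^{-1} z \right\} + \text{constant}.
\end{align*}
```

The coefficient n/2 on the logarithm comes from the determinant identity |σ²Ω| = σ^{2n}|Ω| for an n × n matrix.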
To implement the proposed MC method, first, we simulate data z^{(m)}, for m = 1, . . . , M,
based on (1) and (3) with θ and σ² replaced by \hat{θ} in (13) and \hat{σ}² in (14). Next, we compute
T_m in (12) based on z^{(m)}. Then the MC p-value of the proposed test is
\[ \hat{p} = \frac{1}{M+1} \sum_{m=1}^{M} I(T_m > T). \tag{15} \]
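Once the observed statistic and the M simulated statistics are available, (15) is a one-line computation. A minimal sketch, with a hypothetical function name and toy inputs:

```python
def mc_p_value(t_obs, t_sim):
    """p-hat of (15): number of simulated statistics exceeding t_obs, divided by M + 1."""
    return sum(1 for t in t_sim if t > t_obs) / (len(t_sim) + 1)


if __name__ == "__main__":
    # Toy values only; in practice t_sim holds T_m computed from data simulated under H0.
    print(mc_p_value(2.0, [1.0, 3.0, 2.5, 0.5]))
```

Each simulated T_m must be computed through the full pipeline, including the re-selection of the two subregions, so that the null distribution reflects the selection step.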
Although we introduce the segmentation method before hypothesis testing, in practice, we
first perform the stationarity test and obtain the p-value ˆp of (15). We use a stationary model
for subsequent analysis if ˆp ≥0.05. Otherwise, we apply the proposed spatial segmentation
method to partition D into K stationary subregions with K selected by minimizing the Bayesian
information criterion (BIC) (Schwarz, 1978):
\[ \mathrm{BIC}(K) = f_K\left( \hat{p}_1^{(K)}, \dots, \hat{p}_K^{(K)} \right) + 4K \log(n^*). \tag{16} \]
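Selecting K by (16) can be sketched as follows. The function names and the numerical values of f_K are hypothetical; they only illustrate how the penalty 4K log(n*) trades off against the decrease in the objective. One reading of the factor 4 is that each component carries four quantities (two seed coordinates, a mean, and a variance), though that interpretation is an assumption here.

```python
import math


def bic(f_value, K, n_star):
    """BIC(K) of (16) for a minimized objective value f_K."""
    return f_value + 4 * K * math.log(n_star)


def select_K(f_values, n_star):
    """f_values maps K to the minimized f_K; returns the K with the smallest BIC."""
    return min(f_values, key=lambda K: bic(f_values[K], K, n_star))


if __name__ == "__main__":
    f_values = {1: 300.0, 2: 150.0, 3: 140.0, 4: 138.0}  # hypothetical values
    print(select_K(f_values, n_star=250))
```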
4. Simulation studies
4.1. Testing stationarity
We examined the size of the proposed stationarity test under H0 by performing the same
simulation experiment as in Section 7.1.1 of Bandyopadhyay and Rao (2017). We considered
a zero-mean spatial process {y(s) : s ∈D} on a region D = [−5/2, 5/2] × [−5/2, 5/2] with
a Mat´ern covariance function of (3).
We generated data according to (1) with σ2 = 1,
ν = 1/2, α ∈{1/3, 2/3, 1, 4/3, 2}, and τ 2 ∈{0, 0.01}. In addition, we considered various
sample sizes n ∈{50, 100, 500, 1000, 2000} and two distributions for sampling locations,
including a uniform distribution and a clustered distribution with two clusters (see details
in Bandyopadhyay and Rao, 2017), resulting in a total of 5 × 2 × 5 × 2=100 combinations.
We compared our method with BR’s (Bandyopadhyay and Rao, 2017). The empirical
Type-I error rates under various settings for the uniform and the clustered distributions
are shown in Table 1 and Table 2, respectively. Although our method shows a few elevated
Type-I error rates when spatial dependence is strong, overall, the Type-I error rates are close
to the nominal level. On the other hand, the Type-I error rates for BR's method tend
to be too large in a few cases under the uniform design and too small in many cases
under the clustered design. The distributions of p-values for various scenarios under H0 are
Table 1: Empirical Type-I errors for our method and BR's method (Bandyopadhyay and Rao, 2017) under
various scenarios with a uniform sampling design based on 500 simulated replicates.
                           τ² = 0                                τ² = 0.01
   n   Method   α=1/3  α=2/3  α=1    α=4/3  α=2      α=1/3  α=2/3  α=1    α=4/3  α=2
  50   Ours     0.068  0.054  0.052  0.058  0.060    0.062  0.060  0.056  0.058  0.054
  50   BR       0.030  0.020  0.040  0.050  0.090    0.030  0.020  0.050  0.050  0.080
 100   Ours     0.034  0.042  0.044  0.054  0.058    0.034  0.046  0.044  0.056  0.050
 100   BR       0.030  0.030  0.030  0.040  0.040    0.030  0.050  0.040  0.050  0.040
 500   Ours     0.046  0.054  0.056  0.052  0.050    0.040  0.050  0.048  0.054  0.048
 500   BR       0.020  0.030  0.020  0.030  0.030    0.040  0.080  0.070  0.130  0.120
1000   Ours     0.072  0.072  0.068  0.070  0.054    0.072  0.072  0.070  0.070  0.068
1000   BR       0.050  0.050  0.060  0.060  0.080    0.080  0.100  0.100  0.140  0.130
2000   Ours     0.078  0.066  0.064  0.064  0.072    0.074  0.074  0.068  0.076  0.080
2000   BR       0.070  0.060  0.090  0.090  0.080    0.070  0.090  0.090  0.180  0.180
Table 2: Empirical Type-I errors for our method and BR's method (Bandyopadhyay and Rao, 2017) under
various scenarios with a clustered sampling design based on 500 simulated replicates.
                           τ² = 0                                τ² = 0.01
   n   Method   α=1/3  α=2/3  α=1    α=4/3  α=2      α=1/3  α=2/3  α=1    α=4/3  α=2
  50   Ours     0.064  0.060  0.074  0.072  0.078    0.058  0.060  0.070  0.076  0.072
  50   BR       0.020  0.020  0.010  0.010  0.020    0.020  0.020  0.020  0.030  0.020
 100   Ours     0.082  0.074  0.072  0.062  0.054    0.080  0.078  0.064  0.074  0.068
 100   BR       0.020  0.020  0.020  0.020  0.020    0.020  0.030  0.020  0.030  0.030
 500   Ours     0.060  0.052  0.058  0.060  0.056    0.064  0.052  0.052  0.060  0.060
 500   BR       0.020  0.020  0.010  0.010  0.010    0.010  0.020  0.020  0.010  0.010
1000   Ours     0.056  0.058  0.064  0.066  0.060    0.060  0.060  0.060  0.062  0.062
1000   BR       0.010  0.010  0.010  0.010  0.010    0.010  0.010  0.010  0.010  0.020
2000   Ours     0.052  0.062  0.074  0.076  0.076    0.058  0.058  0.072  0.074  0.076
2000   BR       0.010  0.010  0.010  0.010  0.010    0.010  0.010  0.010  0.010  0.010
displayed in Figures A.1-A.4. They are all very close to the uniform distribution on (0, 1) as
we anticipate.
Next, we investigated the power of the proposed test following the same setups as in
Bandyopadhyay and Rao (2017). We considered three scenarios. In the first two scenarios, we
replaced the stationary Matérn covariance function of (3) by a nonstationary Matérn
covariance function with λ = 20 and 40, respectively:
\[ \mathrm{cov}(y(s_1), y(s_2)) = |\Sigma_\lambda(s_1)|^{1/4}\, |\Sigma_\lambda(s_2)|^{1/4}\, \left| \{\Sigma_\lambda(s_1) + \Sigma_\lambda(s_2)\}/2 \right|^{-1/2} \exp\left\{ -\sqrt{Q_\lambda(s_1, s_2)} \right\}, \tag{17} \]

Table 3: Empirical powers for our and BR's methods (Bandyopadhyay and Rao, 2017) under various scenarios
based on 500 simulated replicates.
                         τ² = 0                        τ² = 0.01
   n   Method   λ = 20   λ = 40   4 blocks    λ = 20   λ = 40   4 blocks
  50   Ours     0.050    0.058    0.074       0.054    0.060    0.076
  50   BR       0.020    0.030    0.050       0.030    0.040    0.050
 100   Ours     0.060    0.044    0.106       0.058    0.048    0.100
 100   BR       0.050    0.040    0.040       0.040    0.040    0.040
 500   Ours     0.340    0.136    0.570       0.328    0.140    0.548
 500   BR       0.190    0.100    0.110       0.180    0.090    0.110
1000   Ours     0.760    0.266    0.926       0.744    0.264    0.910
1000   BR       0.470    0.350    0.240       0.460    0.360    0.240
2000   Ours     0.990    0.570    1.000       0.980    0.560    1.000
2000   BR       0.700    0.850    0.360       0.710    0.850    0.360
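where
\[ Q_\lambda(s_1, s_2) \equiv 2 (s_1 - s_2)' \left\{ \Sigma_\lambda(s_1) + \Sigma_\lambda(s_2) \right\}^{-1} (s_1 - s_2), \]
\[ \Sigma_\lambda(s) \equiv
\begin{pmatrix}
\log\left( \frac{s_x}{\lambda} + \frac{3}{4} \right) & -\frac{\|s\|^2}{\lambda^2} \\
\frac{\|s\|^2}{\lambda^2} & \log\left( \frac{s_x}{\lambda} + \frac{3}{4} \right)
\end{pmatrix}
\begin{pmatrix}
1 & 0 \\
0 & 0.5
\end{pmatrix}
\begin{pmatrix}
\log\left( \frac{s_x}{\lambda} + \frac{3}{4} \right) & \frac{\|s\|^2}{\lambda^2} \\
-\frac{\|s\|^2}{\lambda^2} & \log\left( \frac{s_x}{\lambda} + \frac{3}{4} \right)
\end{pmatrix}, \]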
and s = (sx, sy)′. For the third scenario, we considered a zero-mean piecewise stationary
process {y(s) : s ∈ D} by dividing D = [−5/2, 5/2] × [−5/2, 5/2] into 2 × 2 blocks of equal
sizes. The processes on the four blocks are mutually independent and have the Matérn covariance
functions of (3), with σ² = 1, ν = 1/2, and four different values of α ∈ {1, 1/3, 1/2, 2/3} for
the four blocks. For each scenario, we considered the uniform sampling design and generated
data according to (1) with τ² ∈ {0, 0.01} and n ∈ {50, 100, 500, 1000, 2000}, resulting in 10
different combinations. The empirical powers are displayed in Table 3 based on 500 simulated
replicates. Except for a few cases in Scenario 2 with λ = 40 and n ≥ 1000, our method is
more powerful than BR's method in detecting spatial nonstationarity.
4.2. Spatial segmentation
We investigated the cluster recovery ability of the proposed method in spatial segmenta-
tion. We considered a region D = [0, 1]2 and decomposed it into D1 ∪D2 as shown in Figure
Figure 4: A partition of D into D1 and D2 and their corresponding spatial dependence parameters α1 and
α2.
4. We generated a zero-mean spatial process {y(s) : s ∈ D} on D based on
\[ y(s) = w_1(s; a)\,\eta_1(s) + w_2(s; a)\,\eta_2(s); \quad s \in D, \tag{18} \]
where
\[ w_k(s; a) \equiv \frac{\exp(-d(s, D_k)/a)}{\exp(-d(s, D_1)/a) + \exp(-d(s, D_2)/a)}; \quad s \in D,\ k \in \{1, 2\}, \]
are weight functions with a > 0 controlling the degree of smoothness for process y(·) around
the boundary between D1 and D2, d(s, D_k) ≡ min_{s*∈D_k} ∥s − s*∥, and η(s) ≡ (η1(s), η2(s))′ is a
zero-mean bivariate spatial process with a bivariate exponential covariance function:
\[ \mathrm{cov}(\eta_k(s), \eta_{k'}(s^*)) = \left( \frac{2\alpha_k \alpha_{k'}}{\alpha_k^2 + \alpha_{k'}^2} \right)^{1/2} \exp\left\{ -\frac{(\alpha_k^2 + \alpha_{k'}^2)^{1/2}}{2^{1/2} \alpha_k \alpha_{k'}}\, \|s - s^*\| \right\}; \quad k, k' \in \{1, 2\}. \]
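The two ingredients of (18) can be checked numerically with a small sketch. The partition used below (D1 = {x ≤ 0.5}, D2 = {x > 0.5} in the unit square) is a hypothetical stand-in, since the actual shapes of D1 and D2 are given only in Figure 4; the function names are also assumptions. The sketch shows that the weights sum to one and equal 1/2 on the boundary, and that the cross-covariance reduces to exp(−h/α_k) when k = k′.

```python
import math


def weights(s, a, split=0.5):
    """(w1, w2) at location s for the hypothetical partition D1 = {x <= split}, D2 = {x > split}."""
    x = s[0]
    d1 = max(0.0, x - split)   # distance from s to D1
    d2 = max(0.0, split - x)   # distance from s to (the closure of) D2
    e1, e2 = math.exp(-d1 / a), math.exp(-d2 / a)
    return e1 / (e1 + e2), e2 / (e1 + e2)


def cross_cov(h, alpha_k, alpha_l):
    """Bivariate exponential covariance between eta_k and eta_l at distance h."""
    s = alpha_k ** 2 + alpha_l ** 2
    scale = math.sqrt(2.0 * alpha_k * alpha_l / s)
    rate = math.sqrt(s) / (math.sqrt(2.0) * alpha_k * alpha_l)
    return scale * math.exp(-rate * h)


if __name__ == "__main__":
    print(weights((0.5, 0.3), a=0.1))    # on the boundary
    print(weights((0.1, 0.2), a=0.01))   # deep inside D1 with a sharp transition
    print(cross_cov(0.2, 0.1, 0.1), math.exp(-2.0))
```

A smaller a makes the weights switch more abruptly across the boundary, which is why the segmentation problem is easier for a = 0.01 than for a = 0.1.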
We generated data according to (1) and (18) with τ² = 0, α1 = 0.1, and α2 ∈
{0.1, 0.2, 0.3, 0.4, 0.5}. Additionally, we considered n ∈ {100, 500} and a ∈ {0.01, 0.1},
resulting in a total of 20 combinations. Note that α2 controls the degree of nonstationarity. When
α1 = α2, we obtain y(·) to be a stationary process with cov(y(s), y(s∗)) = exp(−∥s−s∗∥/α1)
regardless of the value of a. By contrast, a larger departure of α2 from α1 indicates a higher
degree of nonstationarity. These features can be seen in Figure 5, which shows realizations
of y(·) with α2 ∈{0.1, 0.2, 0.3, 0.4, 0.5}.
We applied the proposed optimization of (11) to segment D into Voronoi subregions
{\hat{D}_1, . . . , \hat{D}_K}. We selected the final K according to BIC of (16). The performance of an
estimated clustering \tilde{D} = {\tilde{D}_1, . . . , \tilde{D}_K} is evaluated using the Rand index (Rand, 1971)
Figure 5: Realizations of y(·) from models with various α2 values (panels: α2 = 0.1, 0.2, 0.3, 0.4,
0.5), where a larger α2 value corresponds to a higher degree of nonstationarity, and α2 = 0.1
corresponds to a stationary process.
Table 4: Proportions of selecting the correct number (i.e., K = 2) of clusters under various situations based
on 500 simulated replicates.
            a = 0.01              a = 0.1
 α2     n = 100   n = 500     n = 100   n = 500
 0.1    0.326     0.296       0.286     0.266
 0.2    0.446     0.736       0.402     0.636
 0.3    0.560     0.800       0.474     0.724
 0.4    0.642     0.794       0.580     0.706
 0.5    0.694     0.834       0.648     0.660
based on {s1, . . . , sn}:
\[ R(D, \tilde{D}) \equiv \frac{n_{00} + n_{11}}{n_{00} + n_{01} + n_{10} + n_{11}}, \]
where D = {D1, D2} is the true clustering,
n00 is the number of point pairs that are in different clusters under both D and \tilde{D},
n01 is the number of point pairs that are in the same cluster under D but in different
clusters under \tilde{D},
n10 is the number of point pairs that are in different clusters under D but in the same
cluster under \tilde{D},
n11 is the number of point pairs that are in the same cluster under both D and \tilde{D}.
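The Rand index defined above can be computed directly from two label vectors by counting the pairs on which the two clusterings agree. A minimal sketch with a hypothetical function name:

```python
from itertools import combinations


def rand_index(truth, est):
    """(n00 + n11) / (number of pairs) for two label vectors of equal length."""
    agree = total = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_est = est[i] == est[j]
        agree += same_truth == same_est  # True for pairs counted in n00 or n11
        total += 1
    return agree / total


if __name__ == "__main__":
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, relabeled
    print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The index depends only on the partitions, not on the label names, so relabeling the clusters leaves it unchanged.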
Tables 4 and 5 show the proportions of selecting the correct number of clusters and the
average Rand index values based on our method under various situations. As expected, our
method generally performs better for a smaller a and a larger n.

Table 5: Average Rand index values based on the number of clusters selected by BIC under various situations
based on 500 simulated replicates.
            a = 0.01              a = 0.1
 α2     n = 100   n = 500     n = 100   n = 500
 0.1    0.563     0.547       0.556     0.544
 0.2    0.603     0.746       0.582     0.663
 0.3    0.650     0.845       0.609     0.748
 0.4    0.689     0.884       0.660     0.778
 0.5    0.735     0.905       0.686     0.793
5. An application to precipitation data in Colorado
In this section, we applied our method to a precipitation dataset in Colorado.
The
dataset can be obtained from the Geophysical Statistics Project at the National Center
for Atmospheric Research (http://www.image.ucar.edu/GSP/Data/US.monthly.met/CO.html),
which has been analyzed previously by Paciorek and Schervish (2006) and Qadir et
al. (2021). It consists of monthly total precipitation (in mm) recorded at 367 weather stations
across Colorado from 1895 to 1997. It is well known that Western Colorado is mountainous
with more significant topographical variability than Eastern Colorado.
Following Qadir et al. (2021), we considered the cumulative precipitations in the year
1992 and analyzed the data observed at 254 stations with no missing observations after
applying the log transformation. Figure 6(a) shows the precipitation data we analyzed.
We estimated τ² based on (8) and (9) by selecting small d*_1 and d*_2 so that m1 =
m2 = 250. Applying the proposed test of (12) described in Section 3, we obtained a p-value
smaller than 0.01 for testing spatial stationarity, suggesting that the underlying process
is likely nonstationary.
We then segmented the process into stationary processes based
on subregions by applying the proposed spatial segmentation based on (11) introduced in
Section 2.3. From (16), we obtained the BIC values 359.75, 220.31, 213.82, and 228.96, for
K = 1, . . . , 4, respectively, where K = 1 corresponds to the stationary exponential model.
The smallest BIC value is achieved at K = 3. Figure 6(b)-(d) shows the segmentation results
based on K = 2, 3, 4. Even though we did not utilize any additional information (such as
Figure 6: (a) Precipitation amounts (mm in log scale) at 254 stations in Colorado in 1992; (b) Two subregions
obtained by the proposed methods; (c) Three subregions obtained by the proposed methods; (d) Four
subregions obtained by the proposed methods.
elevation) other than precipitations, Colorado Eastern Plains, which tend to have a different
climate pattern from the rest, are automatically segmented as a subregion for K ∈{2, 3, 4},
demonstrating that the proposed spatial segmentation method is effective.
We also investigated whether the proposed segmentation enhances spatial prediction. We
randomly split the data into training data {z(si) : i ∈Itrain} (consisting of 204 observations)
and test data {z(si) : i ∈Itest} (with 50 observations). Using the training data, we applied
the proposed spatial segmentation method (11) introduced in Section 2.3 with K = 1, . . . , 4.
Upon identifying the K subregions through our methodology, we conducted spatial prediction
by fitting an exponential covariance model to each subregion independently, operating under
the assumption that the data were generated from (1). Our approach considered y(·) as a
piecewise stationary process, in line with the decomposition. For every subregion, the model
parameters were estimated using Maximum Likelihood (ML). Subsequently, we harnessed
ordinary kriging from equation (2) to derive the predictive surface for each subregion. To
gauge the performance of our predictors, we utilized the root mean squared prediction error
(RMSPE) criterion:
\[ \mathrm{RMSPE} = \left\{ \frac{1}{50} \sum_{i \in I_{\mathrm{test}}} \left( \tilde{y}(s_i) - z(s_i) \right)^2 \right\}^{1/2}. \]
We also evaluated the performance of probabilistic forecasts using the continuous ranked
probability score (CRPS, Gneiting and Raftery, 2007):
\[ \mathrm{crps}(F, z) = \int_{-\infty}^{\infty} \left( F(t) - I(t \ge z) \right)^2 dt, \]
where F(·) is the predictive cumulative distribution function, z ∈ R is an observation, and
I(·) is an indicator function. We computed the CRPS based on test data:
\[ \mathrm{CRPS} = \frac{1}{50} \sum_{i \in I_{\mathrm{test}}} \mathrm{crps}\left( \tilde{F}(s_i), z(s_i) \right), \]
where for i ∈ I_test, \tilde{F}(si) is a generic predictive cumulative distribution function of z(si).
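Both scores are straightforward to compute from predicted values and prediction standard deviations. The sketch below assumes a Gaussian predictive distribution, for which the CRPS integral has a well-known closed form; that distributional choice and the function names are assumptions made here, since the text speaks only of a generic predictive distribution.

```python
import math


def rmspe(pred, obs):
    """Root mean squared prediction error over the test set."""
    return math.sqrt(sum((p - z) ** 2 for p, z in zip(pred, obs)) / len(obs))


def crps_gaussian(mu, sd, z):
    """Closed-form CRPS when the predictive distribution is N(mu, sd^2)."""
    w = (z - mu) / sd
    pdf = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))
    return sd * (w * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def mean_crps(mus, sds, obs):
    """Average CRPS over the test set."""
    return sum(crps_gaussian(m, s, z) for m, s, z in zip(mus, sds, obs)) / len(obs)


if __name__ == "__main__":
    print(rmspe([1.0, 2.0], [1.0, 4.0]))
    print(mean_crps([0.0, 1.0], [1.0, 0.5], [0.3, 1.2]))
```

Smaller values are better for both scores; unlike RMSPE, the CRPS also penalizes prediction standard deviations that are too large or too small.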
We randomly split the data into training and test data 200 times and obtained 200
predicted values and prediction standard deviations at each location. Figure 7 shows boxplots
of the RMSPE and CRPS values for K = 1, . . . , 4. Our method performs better than the
stationary model (K = 1) in terms of both RMSPE and CRPS for each of K = 2, 3, 4.
6. Summary
We develop a statistic to track nonstationarity by focusing on a microergodic parameter.
This innovation enables us to simultaneously detect changes in both spatial variances and
spatial ranges, from which we can segment the region into stationary components using
Voronoi tessellations. The proposed method is designed for data observed at irregularly
spaced locations without repeated measurements.

Figure 7: Prediction performances of the precipitation data in Colorado based on 200 pairs of randomly split
training and test data: (a) Boxplots of RMSPEs; (b) Boxplots of CRPSs.
Additionally, we introduce a novel test to detect the nonstationarity of a spatial process.
In our simulations, the test kept the Type-I error rate close to the nominal level and was
more powerful than the test of Bandyopadhyay and Rao (2017) in most settings. Compared to
that test, which tends to underperform with an irregular sampling design,
our test remains largely unaffected by the irregularity of data locations.
The proposed stationarity test offers another advantage: it can point out where the
nonstationarity occurs once rejected. As a result, we can perform kriging by applying a
stationary model to each component separately. It is also conceivable to take this further
by establishing a divide-and-conquer strategy to combine the results. These avenues present
promising research directions, especially when dealing with massive spatial data. Further
investigations along these lines, including the construction of nonstationary models based
on locally stationary processes and the development of scalable methods for kriging, are of
significant interest but fall beyond the scope of this paper. We intend to explore these areas
in future work.

Figure A.1: The distributions of p-values under H0 for various scenarios with τ 2 = 0 under a uniform
sampling design.
Appendix A.
In this section, we display the distributions of p-values for various scenarios under H0 in
Section 4.1.
References
Anselin, L. (1995). Local indicators of spatial association–LISA, Geographical Analysis, 27,
93–115.
Bandyopadhyay, S. and Rao, S. S. (2017). A test for stationarity for irregularly spaced
spatial data, Journal of the Royal Statistical Society, Series B, 79, 95–123.
Cressie, N. (1993). Statistics for Spatial Data, rev. edn, Wiley, New York, NY.

Figure A.2: The distributions of p-values under H0 for various scenarios with τ 2 = 0.01 under a uniform
sampling design.

Figure A.3: The distributions of p-values under H0 for various scenarios with τ 2 = 0 under a clustered
sampling design.

Figure A.4: The distributions of p-values under H0 for various scenarios with τ 2 = 0.01 under a clustered
sampling design.

Cressie, N. and Hawkins, D. M. (1980). Robust estimation of the variogram: I, Mathematical
Geology, 12, 115–125.
Fuentes, M. (2005). A formal test for non-stationarity of spatial stochastic processes,
Journal of Multivariate Analysis, 96, 30–54.
Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and
estimation, Journal of the American Statistical Association, 102, 359–378.
Guinness, J. and Fuentes, M. (2015). Likelihood approximations for big nonstationary
spatial temporal lattice data, Statistica Sinica, 25, 329–349.
Jun, M. and Genton, M. (2012). A test for stationarity of spatio-temporal random fields on
planar and spherical domains, Statistica Sinica, 22, 1737–1764.
Matérn, B. (1986). Spatial Variation, 2nd ed., Springer-Verlag, Berlin.
Muyskens, A., Guinness, J., and Fuentes, M. (2022). Partition-based nonstationary covariance
estimation using the stochastic score approximation, Journal of Computational
and Graphical Statistics, 31, 1025–1036.
Nidheesh, N., Nazeer, K. A., and Ameer, P. M. (2017). An enhanced deterministic K-Means
clustering algorithm for cancer subtype prediction from gene expression data,
Computers in Biology and Medicine, 91, 213–221.
Paciorek, C. J. and Schervish, M. J. (2006). Spatial modelling using a new class of
nonstationary covariance functions, Environmetrics, 17, 483–506.
Qadir, G. A., Sun, Y. and Kurtek, S. (2021). Estimation of spatial deformation for
nonstationary processes via variogram alignment, Technometrics, 63, 548–561.
Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods, Journal
of the American Statistical Association, 66, 846–850.
Schwarz, G. (1978). Estimating the dimension of a model, The Annals of Statistics, 6,
461–464.
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., and Knight, K. (2005). Sparsity and
smoothness via the fused lasso, Journal of the Royal Statistical Society, Series B, 67,
91–108.
Voronoi, G. (1908). Nouvelles applications des paramètres continus à la théorie des formes
quadratiques. Premier mémoire. Sur quelques propriétés des formes quadratiques positives
parfaites, Journal für die reine und angewandte Mathematik (Crelles Journal),
1908, 97–102.
Zhang, H. (2004). Inconsistent estimation and asymptotically equal interpolations in model-
based geostatistics, Journal of the American Statistical Association, 99, 250–261.
